AITopics

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing SystemsFeb-14-2026, 13:55:08 GMT

5c54e016197805946481d786d80a662e-Paper-Conference.pdf

correlation, machine learning, natural language, (18 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Singapore (0.04)
Asia > China > Zhejiang Province (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Neural Information Processing SystemsDec-24-2025, 20:08:19 GMT

Comprehensive Knowledge Distillation with Causal Intervention

The existing distillation approaches mainly focus on using different criteria to align the sample representations learned by the student and the teacher, while they fail to transfer the class representations. Good class representations can benefit the sample representation learning by shaping the sample representation distribution. On the other hand, the existing approaches enforce the student to fully imitate the teacher while ignoring the fact that the teacher is typically not perfect. Although the teacher has learned rich and powerful representations, it also contains unignorable bias knowledge which is usually induced by the context prior (e.g., background) in the training data. To address these two issues, in this paper, we propose comprehensive, interventional distillation (CID) that captures both sample and class representations from the teacher while removing the bias with causal intervention. Different from the existing literature that uses the softened logits of the teacher as the training targets, CID considers the softened logits as the context information of an image, which is further used to remove the biased knowledge based on causal inference. Keeping the good representations while removing the bad bias enables CID to have a better generalization ability on test data and a better transferability across different datasets against the existing state-of-the-art approaches, which is demonstrated by extensive experiments on several benchmark datasets.

comprehensive knowledge distillation, name change, representation, (8 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.39)

Neural Information Processing SystemsDec-23-2025, 17:33:25 GMT

Causal Intervention for Weakly-Supervised Semantic Segmentation

We present a causal inference framework to improve Weakly-Supervised Semantic Segmentation (WSSS). Specifically, we aim to generate better pixel-level pseudo-masks by using only image-level labels -- the most crucial step in WSSS. We attribute the cause of the ambiguous boundaries of pseudo-masks to the confounding context, e.g., the correct image-level classification of horse and person may be not only due to the recognition of each instance, but also their co-occurrence context, making the model inspection (e.g., CAM) hard to distinguish between the boundaries. Inspired by this, we propose a structural causal model to analyze the causalities among images, contexts, and class labels. Based on it, we develop a new method: Context Adjustment (CONTA), to remove the confounding bias in image-level classification and thus provide better pseudo-masks as ground-truth for the subsequent segmentation model. On PASCAL VOC 2012 and MS-COCO, we show that CONTA boosts various popular WSSS methods to new state-of-the-arts.

causal intervention, name change, weakly-supervised semantic segmentation, (5 more...)

Technology: Information Technology > Artificial Intelligence (0.40)

arXiv.org Artificial IntelligenceDec-10-2025

Empowerment Gain and Causal Model Construction: Children and adults are sensitive to controllability and variability in their causal interventions

Yiu, Eunice, Allen, Kelsey, Ginosar, Shiry, Gopnik, Alison

Learning about the causal structure of the world is a fundamental problem for human cognition. Causal models and especially causal learning have proved to be difficult for large pretrained models using standard techniques of deep learning. In contrast, cognitive scientists have applied advances in our formal understanding of causation in computer science, particularly within the Causal Bayes Net formalism, to understand human causal learning. In the very different tradition of reinforcement learning, researchers have described an intrinsic reward signal called "empowerment" which maximizes mutual information between actions and their outcomes. "Empowerment" may be an important bridge between classical Bayesian causal learning and reinforcement learning and may help to characterize causal learning in humans and enable it in machines. If an agent learns an accurate causal world model, they will necessarily increase their empowerment, and increasing empowerment will lead to a more accurate causal world model. Empowerment may also explain distinctive features of childrens causal learning, as well as providing a more tractable computational account of how that learning is possible. In an empirical study, we systematically test how children and adults use cues to empowerment to infer causal relations, and design effective causal interventions.

artificial intelligence, empowerment, machine learning, (17 more...)

2512.0823

Country: North America > United States > California (0.28)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.87)

Grant, Satchel, Han, Simon Jerome, Tartaglini, Alexa R., Potts, Christopher

Addressing divergent representations from causal interventions on neural networks

arXiv.org Artificial IntelligenceDec-2-2025

A common approach to mechanistic interpretability is to causally manipulate model representations via targeted interventions in order to understand what those representations encode. Here we ask whether such interventions create out-of-distribution (divergent) representations, and whether this raises concerns about how faithful their resulting explanations are to the target model in its natural state. First, we demonstrate theoretically and empirically that common causal intervention techniques often do shift internal representations away from the natural distribution of the target model. Then, we provide a theoretical analysis of two cases of such divergences: "harmless" divergences that occur in the behavioral null-space of the layer(s) of interest, and "pernicious" divergences that activate hidden network pathways and cause dormant behavioral changes. Finally, in an effort to mitigate the pernicious cases, we apply and modify the Counterfactual Latent (CL) loss from Grant (2025) allowing representations from causal interventions to remain closer to the natural distribution, reducing the likelihood of harmful divergences while preserving the interpretive power of the interventions. Together, these results highlight a path towards more reliable interpretability methods.

large language model, machine learning, natural language, (21 more...)

2511.04638

Country:

North America > United States (0.28)
North America > Mexico (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

arXiv.org Artificial IntelligenceNov-7-2025

Cross-modal Causal Intervention for Alzheimer's Disease Prediction

Jin, Yutao, Xiao, Haowen, Zhai, Junyong, Li, Yuxiao, Chu, Jielei, Lv, Fengmao, Li, Yuxiao

Mild Cognitive Impairment (MCI) serves as a prodromal stage of Alzheimer's Disease (AD), where early identification and intervention can effectively slow the progression to dementia. However, diagnosing AD remains a significant challenge in neurology due to the confounders caused mainly by the selection bias of multi-modal data and the complex relationships between variables. To address these issues, we propose a novel visual-language causality-inspired framework named Cross-modal Causal Intervention with Mediator for Alzheimer's Disease Diagnosis (MediAD) for diagnostic assistance. Our MediAD employs Large Language Models (LLMs) to summarize clinical data under strict templates, therefore enriching textual inputs. The MediAD model utilizes Magnetic Resonance Imaging (MRI), clinical data, and textual data enriched by LLMs to classify participants into Cognitively Normal (CN), MCI, and AD categories. Because of the presence of confounders, such as cerebral vascular lesions and age-related biomarkers, non-causal models are likely to capture spurious input-output correlations, generating less reliable results. Our framework implicitly mitigates the effect of both observable and unobservable confounders through a unified causal intervention method. Experimental results demonstrate the outstanding performance of our method in distinguishing CN/MCI/AD cases, outperforming other methods in most evaluation metrics. The study showcases the potential of integrating causal reasoning with multi-modal learning for neurological disease diagnosis.

confounder, large language model, machine learning, (19 more...)

2507.13956

Country: Asia > China (0.28)

Genre: Research Report > Experimental Study (0.87)

Industry: Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing SystemsOct-10-2025, 03:49:37 GMT

5c54e016197805946481d786d80a662e-Paper-Conference.pdf

confounder, correlation, recognition, (15 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Singapore (0.04)
Asia > China > Zhejiang Province (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > Strength High (0.68)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Sanyal, Debdeep, Nagpal, Aaryan, Kumar, Dhruv, Mandal, Murari, Deshpande, Saurabh

time2time: Causal Intervention in Hidden States to Simulate Rare Events in Time Series Foundation Models

arXiv.org Artificial IntelligenceOct-7-2025

While transformer-based foundation models excel at forecasting routine patterns, two questions remain: do they internalize semantic concepts such as market regimes, or merely fit curves? And can their internal representations be leveraged to simulate rare, high-stakes events such as market crashes? To investigate this, we introduce activation transplantation, a causal intervention that manipulates hidden states by imposing the statistical moments of one event (e.g., a historical crash) onto another (e.g., a calm period) during the forward pass. This procedure deterministically steers forecasts: injecting crash semantics induces downturn predictions, while injecting calm semantics suppresses crashes and restores stability. Beyond binary control, we find that models encode a graded notion of event severity, with the latent vector norm directly correlating with the magnitude of systemic shocks. Validated across two architecturally distinct TSFMs, Toto (decoder only) and Chronos (encoder-decoder), our results demonstrate that steerable, semantically grounded representations are a robust property of large time series transformers. Our findings provide evidence for a latent concept space that governs model predictions, shifting interpretability from post-hoc attribution to direct causal intervention, and enabling semantic "what-if" analysis for strategic stress-testing.

machine learning, natural language, regime, (21 more...)

2509.05801

Genre: Research Report > New Finding (1.00)

Industry: Banking & Finance > Trading (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

arXiv.org Artificial IntelligenceSep-22-2025

CIDER: A Causal Cure for Brand-Obsessed Text-to-Image Models

Shen, Fangjian, Liang, Zifeng, Wang, Chao, Wen, Wushao

Text-to-image (T2I) models exhibit a significant yet under-explored "brand bias", a tendency to generate contents featuring dominant commercial brands from generic prompts, posing ethical and legal risks. We propose CIDER, a novel, model-agnostic framework to mitigate bias at inference-time through prompt refinement to avoid costly retraining. CIDER uses a lightweight detector to identify branded content and a Vision-Language Model (VLM) to generate stylistically divergent alternatives. We introduce the Brand Neutrality Score (BNS) to quantify this issue and perform extensive experiments on leading T2I models. Results show CIDER significantly reduces both explicit and implicit biases while maintaining image quality and aesthetic appeal. Our work offers a practical solution for more original and equitable content, contributing to the development of trustworthy generative AI.

artificial intelligence, machine learning, natural language, (16 more...)

2509.15803

Country: Asia (0.28)

Genre: Research Report > New Finding (0.48)

Industry:

Consumer Products & Services > Restaurants (0.93)
Law (0.86)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)